In this paper we present the application of a novel methodology to scientific
citation and collaboration networks. This methodology is designed for understanding the
governing dynamics of evolving networks and relies on an attachment kernel, a scalar 
function of node properties, which stochastically drives the addition and deletion of 
vertices and edges. We illustrate how the kernel function of a given network can be 
extracted from the history of the network and discuss other possible applications. 

1.1 Introduction

The network representation of complex systems has been very successful. The 
key to this success is universality in at least two senses. First, the simplicity of 
representing complex systems as networks makes it possible to apply network 
theory to very different systems, ranging from the social structure of a group to 
the interactions of proteins in a cell. Second, these very different networks show 
universal structural traits such as the small-world property and the scale-free 
degree-distribution. See the cited reviews for an overview of complex network research.

Usually it is assumed that the life of most complex systems is defined by some 
- often hidden and unknown - underlying governing dynamics. These dynamics
answer the question 'How does it work?', and a fair share of scientific
effort goes into uncovering them.

In the network representation the life of a (complex or not) system is modeled 
as an evolving graph: sometimes new vertices are introduced to the system while 
others are removed, new edges are formed, others break and all these events are 
governed by the underlying dynamics. See [3] for data-driven network
evolution studies.

This paper is organized as follows. In Section 1.2 we define a framework
for studying the dynamics of two types of evolving networks and show how
these dynamics can be measured from the data. In Section 1.3 we present two
applications, and finally in Section 1.4 we discuss our results and other possible
applications.

1.2 Modeling evolving networks by attachment kernels

In this section we introduce a framework in which the underlying dynamics of 
evolving networks can be estimated from knowledge of the time dependence of 
the evolving network. 

This framework is a discrete time model, where time is measured by the 
different events happening in the network. An event is a structural change: 
vertex and/or edge additions and/or deletions. The interpretation of an event 
depends on the system being studied; see Section 1.3 of this paper for two
examples. 

The basic assumption of the model is that edge additions depend on some
properties of the vertices of the network. These properties can be structural,
such as the degree of a vertex or its clustering coefficient, or intrinsic,
such as the age of a person in a social network or her yearly net income.
The model is independent of the meaning of these properties.

The vertex properties drive the evolution of the network stochastically 
through an attachment kernel, a function giving the probabilities for any new 
edges which might be added to the network. See [5] for another possible appli- 
cation of attachment kernels. 

In this paper we specify the model framework for two special kinds of
networks: citation and non-decaying networks. More general results will be
published in forthcoming publications.

1.2.1 Citation networks 

Citation networks are special evolving networks. In a citation network in each 
time step (event) a single new node is added to the network together with its 
edges (citations). Edges between "old" nodes are never introduced and there 
are no edge or vertex deletions either. 

For simplicity let us assume that the attachment kernel A(·) depends on
only one property of the potentially cited vertices, their degree. (The formalism
can be generalized easily to include other properties as well.) We assume that
the probability that at time step t an edge e of a new node will attach to an old
node i with degree d_i is given by

P[e cites i] = \frac{A(d_i(t))}{\sum_{k=1}^{N(t)} A(d_k(t))}.    (1.1)

The denominator is simply the sum of the attachment kernel functions evaluated 
for every node of the network in the current time step. 

With this simple equation the model framework for citation networks is de- 
fined: we assume that in each time step a single new node is attached to the 
network and that it cites other, older nodes with the probability given by (1.1).
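
As a small illustration of (1.1), the sketch below turns a list of node degrees into citation probabilities for an arbitrary kernel. The function name citation_probabilities, the three-node snapshot and the example kernel A(d) = d + 1 are assumptions chosen for illustration; they are not taken from the framework itself.

```python
def citation_probabilities(degrees, kernel):
    """Return P[e cites i] for every old node i, following (1.1)."""
    weights = [kernel(d) for d in degrees]  # A(d_i(t)) for each old node
    total = sum(weights)                    # denominator of (1.1)
    return [w / total for w in weights]


# Hypothetical snapshot: three old nodes with degrees 0, 1 and 3.
degrees = [0, 1, 3]
probs = citation_probabilities(degrees, lambda d: d + 1.0)
print(probs)
```

Any non-negative kernel can be passed in the same way; only the ratios of the kernel values matter, because the normalization cancels a constant factor.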

For a given citation network we can use this model to estimate the form of 
the kernel function based on data about the history of the network. In this paper
we only give an overview of this estimation process; see [7] for the details.

Based on (1.1), the probability that an edge e of a new node at time t cites
an old node with degree d is given by

P[e cites a d-degree node] = P_e(d) = \frac{A(d) N_d(t)}{S(t)},    S(t) = \sum_{k=1}^{N(t)} A(d_k(t)).    (1.2)

N_d(t) is the number of d-degree nodes in the network in time step t. From here
we can extract the kernel function A(d):

A(d) = \frac{P_e(d) S(t)}{N_d(t)}.    (1.3)

If we know S(t) and N_d(t), then by estimating P_e(d) from the network data
we obtain an estimate for A(d) via (1.3). Doing this for each edge and each
degree d gives, in practice, a reasonable approximation of the A(d) function
for most d values. (Of course we cannot estimate A(d) for those degrees which
were never present in the network.)

It is easy to calculate N_d(t), so the only piece missing for the estimation is
S(t); however, this is defined in terms of the A(d) function that we want to
measure. We can use an iterative approach to make better and better
approximations for A(d) and S(t). First we assume that S_0(t) = 1 for each t
and measure A_0(d), which can be used to calculate the next approximation of
S(t), S_1(t), yielding A_1(d) via the measurement, etc. In practice this procedure
converges quickly for the systems we have studied: after five iterations the
difference between successive A_n(d) and A_{n+1}(d) estimates is very small.
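
The iteration can be sketched as follows. This is one possible reading of the procedure, not the implementation described in [7]. It assumes that the degree of a node is its number of received citations, that degrees are frozen while the edges of one new node are added, and that P_e(d) is estimated by averaging, over all edges added while a d-degree node existed, the indicator that the edge cited a d-degree node. The names estimate_kernel and simulate_citations, and the synthetic test network, are hypothetical.

```python
import random
from collections import Counter, defaultdict


def estimate_kernel(history, iterations=5):
    """Estimate A(d) from a citation history.

    history[t] is the list of older nodes (ids < t) cited by node t.
    The result is only defined up to a constant factor.
    """
    kernel = {}  # current estimate A_n(d); empty means "use S_0(t) = 1"
    for _ in range(iterations):
        degree = []                # degree of every node added so far
        n_d = Counter()            # N_d(t): number of nodes with degree d
        hits = defaultdict(float)  # sum of S(t) / N_d(t) over edges citing degree d
        trials = Counter()         # edges added while a degree-d node existed
        for cited in history:
            if cited:
                s_t = sum(n * kernel[d] for d, n in n_d.items()) if kernel else 1.0
                for d in n_d:
                    trials[d] += len(cited)
                for i in cited:
                    hits[degree[i]] += s_t / n_d[degree[i]]
            # Degrees are frozen within a time step and updated afterwards.
            for i in cited:
                n_d[degree[i]] -= 1
                if n_d[degree[i]] == 0:
                    del n_d[degree[i]]
                degree[i] += 1
                n_d[degree[i]] += 1
            degree.append(0)
            n_d[0] += 1
        kernel = {d: hits[d] / trials[d] for d in trials}
    return kernel


def simulate_citations(n_nodes, m, kernel, seed=0):
    """Synthetic history: each new node cites m old nodes drawn according to (1.1)."""
    rng = random.Random(seed)
    degree, history = [0], [[]]
    for t in range(1, n_nodes):
        weights = [kernel(d) for d in degree]
        cited = rng.choices(range(t), weights=weights, k=m)
        for i in cited:
            degree[i] += 1
        degree.append(0)
        history.append(cited)
    return history


# Synthetic network grown with A(d) = d + 1, then measured back.
history = simulate_citations(2000, 3, lambda d: d + 1.0, seed=1)
estimate = estimate_kernel(history)
for d in range(5):
    print(d, round(estimate[d] / estimate[0], 2))
```

Because the kernel is only determined up to a constant factor, the printed values are ratios relative to the estimate for zero-degree nodes; for a network grown with A(d) = d + 1 they should lie close to d + 1.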



1.2.2 Non-decaying networks 

Non-decaying networks are more general than citation networks because
connections can be formed between older nodes as well. It is still true, however, that
neither edges nor nodes are ever removed from the network.

Similarly to the previous section, we assume that the attachment kernel
depends on the degree of the vertices, but this time on the degree of both vertices
involved in the potential connection. The probability of forming an edge between
nodes i and j in time step t is given by

P[i and j will be connected] = \frac{A(d_i(t), d_j(t))}{\sum_{k=1}^{N(t)} \sum_{l=1}^{N(t)} (1 - a_{kl}(t)) A(d_k(t), d_l(t))}.    (1.4)

The denominator is the sum of the attachment kernel function applied to all
possible (not yet realized) edges in the network. a_{kl}(t) is 1 if there is an edge
between nodes k and l in time step t and 0 otherwise.

Using an argument similar to that of the previous section we can estimate
A(d*, d**) via

P[e connects d* and d** degree nodes] = P_e(d*, d**) = \frac{A(d*, d**) N_{d*,d**}(t)}{S(t)},    (1.5)

where N_{d*,d**}(t) is the number of not yet realized edges between d* and d** degree
nodes in time step t, and

S(t) = \sum_{k=1}^{N(t)} \sum_{l=1}^{N(t)} (1 - a_{kl}(t)) A(d_k(t), d_l(t)).    (1.6)

Rearranging (1.5) gives

A(d*, d**) = \frac{P_e(d*, d**) S(t)}{N_{d*,d**}(t)}.    (1.7)

S(t) can be approximated using an iterative approach similar to that intro- 
duced in the previous section. 
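
To make the role of the not yet realized edges concrete, the following sketch computes the connection probability of every unconnected pair of nodes in a small graph. It is an illustration under stated assumptions: the sum runs over unordered pairs i < j (for a symmetric kernel this differs from a sum over ordered pairs only by a constant factor, which cancels after normalization), and the function name pair_probabilities, the four-node graph and the kernel d*d** + 1 are hypothetical.

```python
from itertools import combinations


def pair_probabilities(n_nodes, edges, kernel):
    """Connection probability for every not yet realized pair (i, j), i < j."""
    degree = [0] * n_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    existing = {frozenset(e) for e in edges}
    weights = {
        (i, j): kernel(degree[i], degree[j])
        for i, j in combinations(range(n_nodes), 2)
        if frozenset((i, j)) not in existing  # only pairs with a_ij(t) = 0
    }
    total = sum(weights.values())  # plays the role of S(t)
    return {pair: w / total for pair, w in weights.items()}


# Hypothetical graph: node 0 is linked to nodes 1 and 2, node 3 is isolated.
edges = [(0, 1), (0, 2)]
probs = pair_probabilities(4, edges, lambda d1, d2: d1 * d2 + 1.0)
print(probs)
```

Already connected pairs do not appear in the result, which mirrors the (1 - a_kl(t)) factor in the sum.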



1.3 Applications 

In this section we briefly present results for two applications for the model frame- 
work and measurement method. For other applications and details see 

1.3.1 Preferential attachment in citation networks

The preferential attachment model |S] gives a mechanism to generate the 
scale-free degree-distribution often found in various networks. In our framework 
for citation networks it simply means that the kernel function linearly depends 
on the degree: 

A(d)=d + a, (1.8) 

where a is a constant. 

By using our measurement method, it is possible to measure the kernel func- 
tion based on node degree for various citation networks and check whether they 
evolve based on this simple principle. 

Let us first consider the network of high-energy physics papers 
from the arXiv e-print archive. We used data for papers submit- 
ted between January, 1992 and July, 2003, which included 28632 pa- 
pers and 367790 citations among them. The data is available online 
at http://www.cs.cornell.edu/projects/kddcup/datasets.html. This
dataset and other scientific citation networks are well studied, see ^3 for 
examples. 

First we applied the measurement method based on the node degree to
this network and found that the attachment kernel of the network is indeed
close to the one predicted by the preferential attachment model, that is

A_{HEP}(d) = d^{0.85} + 1    (1.9)

gives a reasonably good fit to the data. See the measured form of the kernel in
Fig. 1.1.

The small exponent for d is in good agreement with the fact that the degree 
distribution of this network decays faster than a power-law. 

Next, we applied the measurement method using two properties of the
potentially cited nodes: their degree and age. The latter is simply defined as the
difference between the current time step and the time step when the node was added.
We found that the two-variable attachment kernel A(d, a) has the following form:

A_{HEP}(d, a) = (d^{1.14} + 1) a^{-1.14}.    (1.10)

This two-variable attachment kernel gives a better understanding of the
dynamics of this network: the citation probability increases about linearly with the
degree of the nodes and decreases as a power-law with their age. Note that
these two effects were both present in the degree-only attachment kernel
A_{HEP}(d), which is why the preferential attachment exponent was smaller there
(0.85 < 1.14).
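
A short numerical check makes the aging effect tangible. The sketch below evaluates the degree- and age-dependent kernel as read from (1.10), with exponents 1.14 and -1.14. The function name kernel_hep and the example degrees and ages are hypothetical, and age is measured in time steps as defined above.

```python
def kernel_hep(degree, age):
    """Degree- and age-dependent kernel as read from (1.10)."""
    return (degree ** 1.14 + 1.0) * age ** -1.14


# Hypothetical nodes: same degree, ages differing by a factor of ten.
young = kernel_hep(10, 100)
old = kernel_hep(10, 1000)
print(old / young)  # ratio of attachment weights, old versus young

# Hypothetical nodes: same age, degrees differing by a factor of two.
print(kernel_hep(20, 100) / kernel_hep(10, 100))
```

Under this form a tenfold increase in age multiplies the attachment weight by 10^{-1.14}, independently of the degree, while doubling the degree of a well-cited node slightly more than doubles its weight.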

Similar results were obtained for the citation network of US patents granted 
between 1975 and 1999 containing 2,151,314 vertices and 10,565,431 edges: 

A_{patent}(d) = …,    A_{patent}(d, a) = (d^{1.2} + 1) a^{-1.6}.    (1.11)

These two studies show that the preferential attachment phenomenon can
be present in a network even if it does not have power-law degree-distribution
because there is another process - aging in our case - which prevents nodes from
gaining very many edges.

Figure 1.1: The two measured kernel functions for the HEP citation network. The
left plot shows the degree dependent kernel, the right the degree and age dependent
kernel. On the right plot four sections along the degree axis are shown for different
vertex ages.

1.3.2 The dynamics of scientific collaboration networks 

In this section we briefly present the results of applying our methods to a non- 
decaying network: the cond-mat collaboration network. In this network a node 
is a researcher who published at least one paper in the arXiv cond-mat archive 
between 1970 and 1997 (this is the date when the paper was submitted to cond- 
mat, not the actual publication date, but most of the time these two are almost 
the same). There is an edge between two researchers/nodes if they've published 
at least one paper together. The data set contains 23708 papers, 17636 authors 
and 59894 edges. 

We measured the attachment kernel for this network based on the degrees of
the two potential neighbors. See Fig. 1.2 for the A_{cond-mat}(d*, d**) function.

We tried to fit various functional forms to the two-dimensional attachment
kernel function to check which gives a better description of the dynamics. See
Fig. 1.3 for the shape of the fitted functions and Table 1.1 for the functional
forms and the results.

The best fit was obtained by

A'_{cond-mat}(d*, d**) = c_1 \cdot (d* d**)^{c_2} + c_3,    (1.12)

where the c_i are constants.
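
The quantity minimized in these fits is the least square difference between a functional form and the measured kernel. The sketch below shows only this objective, not the optimizers. The default parameters are the values reported in Table 1.1 for the additive form D and for the multiplicative power form (1.12). The "measured" values are a synthetic stand-in generated from (1.12) itself, not the cond-mat data, and the names fit_error, multiplicative and additive are hypothetical.

```python
def fit_error(form, measured):
    """Sum of squared differences between a fitted form and measured kernel values."""
    return sum((form(d1, d2) - value) ** 2 for (d1, d2), value in measured.items())


def multiplicative(d1, d2, c1=0.016, c2=1.22, c3=0.58):
    """c1 * (d1 * d2) ** c2 + c3, the form of (1.12)."""
    return c1 * (d1 * d2) ** c2 + c3


def additive(d1, d2, c1=1.08, c2=-18.98):
    """c1 * (d1 + d2) + c2, form D of Table 1.1."""
    return c1 * (d1 + d2) + c2


# Synthetic stand-in for the measured kernel (NOT the cond-mat measurements).
measured = {
    (d1, d2): multiplicative(d1, d2)
    for d1 in range(1, 21)
    for d2 in range(1, 21)
}

print(fit_error(multiplicative, measured))
print(fit_error(additive, measured))
```

An optimizer would repeatedly call such an objective with trial parameters; here the parameters are fixed, so the comparison only shows that a form matching the data yields a smaller error than one that does not.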

See for other studies on collaboration networks. 



1.4 Discussion 



We've briefly presented a methodology for understanding the evolution of
networks through kernel functions and showed how the kernel functions can be
extracted from network data.

Figure 1.2: The attachment kernel for the cond-mat collaboration network. The
surface plot was smoothed by applying a double exponential smoothing kernel to it.
The right plot has logarithmic axes; it shows that the kernel function has
high values for zero-degree nodes. This might be because a new researcher will usually
write a paper with collaborators and thus will have a high probability of adding links
to the network.

We've discussed two applications for this methodology: first the "fitting" of 
the preferential attachment model to a network of scientific citations and then 
determining how the evolution of a scientific collaboration network depends on 
the degree of the vertices. 

The methodology outlined here is general and can be successfully applied to 
any kind of evolving network where time dependent data is available. By defin- 
ing the kernel function in terms of the potentially important vertex properties 
one can check whether these properties really significantly influence network evo- 
lution: if a kernel function is not sensitive to one of its arguments that suggests 
that this argument does not have an important contribution. Another possible 
application would be to identify changes in the dynamics of a system by doing 
the measurements in sliding time windows, see jS] for an example. 

Figure 1.3: A shows the smoothed measured kernel function for the collaboration
network; B, C and D are the fitted functional forms shown in the first three lines of
Table 1.1. The best fit is clearly obtained by the multiplicative fit.



     Fitted form                        Fitted parameters               Fit error   Fitting method

B    c_1 max(d*, d**) + c_2             c_1 = 1.26, c_2 = -10.56        107357.6    Nelder-Mead
C    c_1 d* d** + c_2                   c_1 = 0.0697, c_2 = -2.11       4300.2      Nelder-Mead
D    c_1 (d* + d**) + c_2               c_1 = 1.08, c_2 = -18.98        31348.9     Nelder-Mead
     c_1 d* d** + c_2 (d* + d**)        c_1 = 0.0783, c_2 = -0.12,      3532.9      BFGS
       + c_3 max(d*, d**) + c_4         c_3 = -0.093, c_4 = 1.50
     c_1 (d* d**)^{c_2} + c_3           c_1 = 0.016, c_2 = 1.22,        3210.4      SANN
                                        c_3 = 0.58


Table 1.1: Four optimization methods were run for each functional form to minimize
the least square difference: BFGS, Nelder-Mead, CG and SANN. The results of the
best fits are included in the table. See the cited references for the details of these methods.